Graphs for post-test survey analysis for city governments.
Install necessary packages
Code
# Set a CRAN mirror
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Install necessary packages
# install.packages("magrittr")
# install.packages("dplyr")
# install.packages("ggplot2")
# install.packages("knitr")
# install.packages("kableExtra")
# install.packages("survey")
# install.packages("plotly")

# Load the packages
library(magrittr)
library(dplyr)
library(plotly)
library(ggplot2)
library(knitr)
library(kableExtra)
library(survey)
Survey design weights
Code
# Read the data from the CSV file
cohort3FinalGov3 <- read.csv("~/Documents/R_Projects/TOPC-impact-evaluation-dashboard/Cohort_3/cohort3FinalGov3.csv")

# Define population and sample sizes
population_counts <- c(Akron = 8, Detroit = 9, Macon = 10, Miami = 13)
sample_counts <- c(Akron = 4, Detroit = 2, Macon = 4, Miami = 3)

# Total population and sample size
total_population <- sum(population_counts)
total_sample <- sum(sample_counts)

# Calculate the weights
weights <- (population_counts / total_population) / (sample_counts / total_sample)

# Print the calculated weights
print(weights)
Akron Detroit Macon Miami
0.650000 1.462500 0.812500 1.408333
Code
# Convert Likert scale questions to numeric
likert_questions <- c("q4", "q5", "q6", "q7", "q9", "q10", "q11", "q12", "q13", "q14", "q15", "q16", "q17", "q20", "q21")
for (question in likert_questions) {
  cohort3FinalGov3[[question]] <- as.numeric(as.character(cohort3FinalGov3[[question]]))
}

# Map between city names
city_map <- list(
  "Miami-Dade County, FL" = "Miami",
  "Akron, OH" = "Akron",
  "Detroit, MI" = "Detroit",
  "Macon-Bibb County, GA" = "Macon"
)

# Update dataframe to use the mapped names
cohort3FinalGov3$city_mapped <- unlist(lapply(cohort3FinalGov3$q3, function(x) city_map[[x]]))

# Assign weights using the mapped city names
cohort3FinalGov3$weights <- weights[cohort3FinalGov3$city_mapped]

# Define the survey design
design <- svydesign(ids = ~1, strata = ~city_mapped, weights = ~weights, data = cohort3FinalGov3)
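The city lookup in the step above can also be written as a vectorized named-vector lookup. The sketch below is illustrative only: `city_lookup`, `q3_example`, and `city_mapped_example` are hypothetical names, and the q3 values are made up rather than taken from the survey. Its advantage is that an unrecognized label becomes NA instead of being dropped by unlist(), which would leave the result shorter than the data frame.

```r
# Named character vector: survey label -> short city name (same pairs as city_map)
city_lookup <- c(
  "Miami-Dade County, FL" = "Miami",
  "Akron, OH"             = "Akron",
  "Detroit, MI"           = "Detroit",
  "Macon-Bibb County, GA" = "Macon"
)

# Hypothetical q3 values for illustration, including one unmatched label
q3_example <- c("Akron, OH", "Detroit, MI", "Somewhere else", "Akron, OH")

# Indexing by name is vectorized; unmatched labels become NA
city_mapped_example <- unname(city_lookup[q3_example])
print(city_mapped_example)

# Report any response that could not be mapped
unmatched <- q3_example[is.na(city_mapped_example)]
if (length(unmatched) > 0) {
  message("Unmapped q3 values: ", paste(unique(unmatched), collapse = ", "))
}
```

Because the result always has one entry per respondent, it can be assigned to a data frame column directly, and the NA entries show which labels need a new mapping.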
Given these counts in the data:
Table 1. Count of population and sample participants in city governments in the TOPC program in Cohort 3
City Teams  Population Count  Sample Count
Akron       8                 4
Detroit     9                 2
Macon       10                4
Miami       13                3
Total       40                13
The formula I used to align the sample proportions with the population proportions:
\(\text{Weight of strata} = \frac{\frac{\text{Strata Population}}{\text{Total Population}}}{\frac{\text{Strata Sample}}{\text{Total Sample}}}\)
\(\text{Weight for Akron} = \frac{\frac{8}{40}}{\frac{4}{13}} = \frac{0.2}{0.3077} \approx 0.650\)
\(\text{Weight for Detroit} = \frac{\frac{9}{40}}{\frac{2}{13}} = \frac{0.225}{0.1538} \approx 1.463\)
\(\text{Weight for Macon} = \frac{\frac{10}{40}}{\frac{4}{13}} = \frac{0.25}{0.3077} \approx 0.813\)
\(\text{Weight for Miami} = \frac{\frac{13}{40}}{\frac{3}{13}} = \frac{0.325}{0.2308} \approx 1.408\)
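One property of this weighting is worth checking: because each weight is a ratio of shares, the weighted respondent counts add back up to the sample size of 13, and the weighted city shares equal the population shares. The sketch below recomputes the weights from the Table 1 counts in base R; `weighted_n` is a name introduced here for illustration.

```r
# Counts from Table 1
population_counts <- c(Akron = 8, Detroit = 9, Macon = 10, Miami = 13)
sample_counts     <- c(Akron = 4, Detroit = 2, Macon = 4,  Miami = 3)

total_population <- sum(population_counts)  # 40
total_sample     <- sum(sample_counts)      # 13

# Weight = population share / sample share
weights <- (population_counts / total_population) / (sample_counts / total_sample)
print(round(weights, 3))

# Weighted respondent count per city, and its total
weighted_n <- weights * sample_counts
print(weighted_n)
print(sum(weighted_n))

# Weighted city shares next to the population shares
print(weighted_n / sum(weighted_n))
print(population_counts / total_population)
```

This is also why the weighted frequency tables in the plots total 13 rather than 40: the weights rebalance the cities without scaling the sample up to the population.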
Descriptive data analysis
For Q4
Note that svyquantile returned NaN for the confidence intervals and standard errors of the median in the Akron, Detroit, and Miami strata. This is most likely because each stratum has only two to four respondents, which is too few to estimate the sampling variability of a median. Macon's interval of 3 to 3 with a standard error of 0 reflects a lack of variation: every Macon respondent gave the same answer.
Akron
Median: The median response is 3, so the central response is “Agree”.
Standard Deviation: The standard deviation is approximately 0.50, which indicates moderate variability: “Agree” is the median response, but there is some dispersion around it.
Detroit
Median: The median response is 3, as in Akron, so the central response is “Agree”.
Standard Deviation: The standard deviation is approximately 0.71, which indicates slightly higher variability than in Akron.
Macon
Median: The median response is 3 with a confidence interval from 3 to 3, so the central response is “Agree” and the estimate shows no variability.
Standard Deviation: The standard deviation is 0, which means all respondents provided the same answer.
Miami
Median: The median response is also 3, so the central response is “Agree”.
Standard Deviation: The standard deviation is approximately 0.58, a moderate level of variability that is slightly higher than in Akron.
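The standard deviations above are the kind of values a handful of integer responses produce. The sketch below uses hypothetical response vectors (not the survey data), chosen only to match each city's sample size, and base R's sd(). Within a single stratum every respondent carries the same weight, so the weighted estimate should agree with the plain sample standard deviation; treat that as an assumption to confirm against the survey package.

```r
# Hypothetical responses on the 1-4 scale; NOT the actual survey data
example_responses <- list(
  Akron   = c(3, 3, 3, 4),  # n = 4, one respondent differs by one point
  Detroit = c(3, 4),        # n = 2, the two respondents differ by one point
  Macon   = c(3, 3, 3, 3),  # n = 4, everyone gives the same answer
  Miami   = c(3, 3, 4)      # n = 3, one respondent differs by one point
)

# Sample standard deviation (n - 1 denominator) for each city
sd_by_city <- sapply(example_responses, sd)
print(round(sd_by_city, 7))
```

With samples this small, one differing answer is enough to move the standard deviation from 0 to 0.5 or more, so differences between cities should be read cautiously.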
Code
# Define the strata
strata <- c("Akron", "Detroit", "Macon", "Miami")

# Initialize a data frame to store the results for question q4
results_df <- data.frame(
  Stratum = character(),
  Median = numeric(),
  StandardDeviation = numeric(),
  stringsAsFactors = FALSE
)

# Calculate stats for question q4 across each stratum
for (stratum in strata) {
  stratum_design <- subset(design, city_mapped == stratum)
  median_val <- svyquantile(~q4, stratum_design, 0.5, na.rm = TRUE)
  sd_val <- sqrt(svyvar(~q4, stratum_design, na.rm = TRUE))

  # Extract the median and standard deviation values
  median_q4 <- median_val[1]  # Median estimate for q4 (with its CI and SE columns)
  sd_q4 <- sd_val[1]          # SD value for q4

  # Append the results to the data frame
  results_df <- rbind(results_df, data.frame(
    Stratum = stratum,
    Median = median_q4,
    StandardDeviation = sd_q4
  ))
}

# Print the results for q4
print(results_df)
      Stratum  Median.q4.quantile  Median.q4.ci.2.5  Median.q4.ci.97.5  Median.q4.se  StandardDeviation
0.5   Akron    3                   NaN               NaN                NaN           0.5000000
0.51  Detroit  3                   NaN               NaN                NaN           0.7071068
0.52  Macon    3                   3                 3                  0             0.0000000
0.53  Miami    3                   NaN               NaN                NaN           0.5773503
Code
# Define the strata and questions
strata <- c("Akron", "Detroit", "Macon", "Miami")
questions <- c("q4", "q5", "q6", "q7", "q9", "q10", "q11", "q12", "q13", "q14", "q15", "q16", "q17", "q20", "q21")

# Initialize a data frame to store the results
results_df <- data.frame(
  Question = character(),
  Stratum = character(),
  Median = numeric(),
  StandardDeviation = numeric(),
  stringsAsFactors = FALSE
)

# Calculate stats for each question across each stratum
for (question in questions) {
  for (stratum in strata) {
    # Create the design for the current stratum
    stratum_design <- subset(design, city_mapped == stratum)

    # Compute the median and standard deviation for the current question
    median_val <- svyquantile(~get(question), stratum_design, 0.5, na.rm = TRUE)
    sd_val <- sqrt(svyvar(~get(question), stratum_design, na.rm = TRUE))

    # Extract the median and standard deviation values
    median_question <- median_val[1]  # Median estimate (with its CI and SE columns)
    sd_question <- sd_val[1]          # SD value for the question

    # Append the results to the data frame
    results_df <- rbind(results_df, data.frame(
      Question = question,
      Stratum = stratum,
      Median = median_question,
      StandardDeviation = sd_question
    ))
  }
}

# Print the results
# print(results_df)
library(knitr)
kable(results_df, caption = "Post-test Survey Results by City and Question", align = 'c', format = "html")
Post-test Survey Results by City and Question
       Question  Stratum  Median.get.question..quantile  Median.get.question..ci.2.5  Median.get.question..ci.97.5  Median.get.question..se  StandardDeviation
0.5    q4        Akron    3                              NaN                          NaN                           NaN                      0.5000000
0.51   q4        Detroit  3                              NaN                          NaN                           NaN                      0.7071068
0.52   q4        Macon    3                              3                            3                             0                        0.0000000
0.53   q4        Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.54   q5        Akron    3                              NaN                          NaN                           NaN                      0.5773503
0.55   q5        Detroit  3                              3                            3                             0                        0.0000000
0.56   q5        Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.57   q5        Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.58   q6        Akron    3                              NaN                          NaN                           NaN                      0.5773503
0.59   q6        Detroit  3                              3                            3                             0                        0.0000000
0.510  q6        Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.511  q6        Miami    4                              4                            4                             0                        0.5773503
0.512  q7        Akron    3                              NaN                          NaN                           NaN                      0.5773503
0.513  q7        Detroit  3                              3                            3                             0                        0.0000000
0.514  q7        Macon    3                              NaN                          NaN                           NaN                      0.9574271
0.515  q7        Miami    4                              4                            4                             0                        0.5773503
0.516  q9        Akron    3                              NaN                          NaN                           NaN                      0.5000000
0.517  q9        Detroit  3                              3                            3                             0                        0.0000000
0.518  q9        Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.519  q9        Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.520  q10       Akron    3                              NaN                          NaN                           NaN                      0.5000000
0.521  q10       Detroit  3                              NaN                          NaN                           NaN                      0.7071068
0.522  q10       Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.523  q10       Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.524  q11       Akron    3                              3                            3                             0                        0.0000000
0.525  q11       Detroit  3                              NaN                          NaN                           NaN                      0.7071068
0.526  q11       Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.527  q11       Miami    4                              4                            4                             0                        0.5773503
0.528  q12       Akron    3                              NaN                          NaN                           NaN                      0.5000000
0.529  q12       Detroit  2                              NaN                          NaN                           NaN                      1.4142136
0.530  q12       Macon    3                              3                            3                             0                        0.5000000
0.531  q12       Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.532  q13       Akron    3                              3                            3                             0                        0.0000000
0.533  q13       Detroit  2                              NaN                          NaN                           NaN                      0.7071068
0.534  q13       Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.535  q13       Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.536  q14       Akron    3                              NaN                          NaN                           NaN                      0.5000000
0.537  q14       Detroit  2                              NaN                          NaN                           NaN                      0.7071068
0.538  q14       Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.539  q14       Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.540  q15       Akron    4                              4                            4                             0                        0.5000000
0.541  q15       Detroit  3                              3                            3                             0                        0.0000000
0.542  q15       Macon    3                              NaN                          NaN                           NaN                      0.5773503
0.543  q15       Miami    3                              NaN                          NaN                           NaN                      1.0000000
0.544  q16       Akron    3                              NaN                          NaN                           NaN                      0.5000000
0.545  q16       Detroit  3                              3                            3                             0                        0.0000000
0.546  q16       Macon    3                              NaN                          NaN                           NaN                      0.9574271
0.547  q16       Miami    3                              3                            3                             0                        0.5773503
0.548  q17       Akron    3                              3                            3                             0                        0.0000000
0.549  q17       Detroit  2                              NaN                          NaN                           NaN                      0.7071068
0.550  q17       Macon    3                              NaN                          NaN                           NaN                      0.5773503
0.551  q17       Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.552  q20       Akron    3                              3                            3                             0                        0.5000000
0.553  q20       Detroit  3                              3                            3                             0                        0.0000000
0.554  q20       Macon    3                              NaN                          NaN                           NaN                      0.5000000
0.555  q20       Miami    3                              NaN                          NaN                           NaN                      0.5773503
0.556  q21       Akron    4                              4                            4                             0                        0.5000000
0.557  q21       Detroit  3                              3                            3                             0                        0.0000000
0.558  q21       Macon    3                              NaN                          NaN                           NaN                      0.5773503
0.559  q21       Miami    4                              4                            4                             0                        0.5773503
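The unlabeled row labels in the table (0.5, 0.51, up to 0.559) are not statistics. They appear to come from rbind() de-duplicating the repeated row name "0.5" (the requested quantile) by appending a counter. The sketch below reproduces the effect in base R with a hypothetical helper, `one_row()`, standing in for the svyquantile output, and then resets the row names.

```r
# Each one-row result carries the row name "0.5", as the quantile output does.
# one_row() is a hypothetical stand-in, not part of the survey package.
one_row <- function(stratum) {
  data.frame(Stratum = stratum, Median = 3, row.names = "0.5")
}

# Stacking rows with identical names forces rbind() to make them unique
combined <- rbind(one_row("Akron"), one_row("Detroit"), one_row("Macon"))
print(rownames(combined))

# Resetting the row names removes the spurious labels before printing
rownames(combined) <- NULL
print(combined)
```

When the table is rendered with kable(), passing `row.names = FALSE` is another way to leave the label column out.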
Participant Background
Q3. Which city/county are you representing?
Code
# Single-variable frequency plot
q3_frequency <- svytable(~q3, design = design)

ggplot(as.data.frame(q3_frequency), aes(x = q3, y = Freq)) +
  geom_bar(stat = "identity", fill = "#041e42", width = 0.7) +
  coord_flip() +
  expand_limits(y = c(0, total_sample)) +
  labs(
    title = "All city governments participated in the survey with the highest number of respondents from Miami-Dade and Macon-Bibb Counties.",
    subtitle = paste("Total number of survey respondents per city team (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.box = "vertical",
    legend.box.margin = margin(0, 0, 0, 0),
    legend.title = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  ) +
  guides(fill = guide_legend(nrow = 1, byrow = TRUE))
Q25. Which title best describes your role and level of seniority at work?
Code
# Single-variable frequency plot
q25_frequency <- svytable(~q25, design = design)

ggplot(as.data.frame(q25_frequency), aes(x = q25, y = Freq)) +
  geom_bar(stat = "identity", fill = "#041e42", width = 0.7) +
  coord_flip() +
  expand_limits(y = c(0, total_sample)) +
  labs(
    title = "Majority of respondents are senior officials or managers.",
    subtitle = paste("Total number of survey respondents (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.box = "vertical",
    legend.box.margin = margin(0, 0, 0, 0),
    legend.title = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  ) +
  guides(fill = guide_legend(nrow = 1, byrow = TRUE))
Q27. How many years have you worked at your respective City or County?
Code
# Single-variable frequency plot
q27_frequency <- svytable(~q27, design = design)

ggplot(as.data.frame(q27_frequency), aes(x = q27, y = Freq)) +
  geom_bar(stat = "identity", fill = "#041e42", width = 0.7) +
  coord_flip() +
  expand_limits(y = c(0, total_sample)) +
  labs(
    title = "Majority are new to working in their city or county governments.",
    subtitle = paste("Total number of survey respondents (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.box = "vertical",
    legend.box.margin = margin(0, 0, 0, 0),
    legend.title = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  ) +
  guides(fill = guide_legend(nrow = 1, byrow = TRUE))
Code
# Cross-tabulation with cities
q27_cross_tab_city <- svytable(~q27 + city_mapped, design = design)

ggplot(as.data.frame(q27_cross_tab_city), aes(x = city_mapped, y = Freq, fill = q27)) +
  geom_bar(stat = "identity", position = "stack", width = 0.7) +
  coord_flip() +
  scale_fill_manual(
    values = c("#041e42", "#6eaddc", "#da291c", "#1B786E"),
    limits = c("0-2 years", "3-5 years", "6-9 years", "10+ years")
  ) +
  labs(
    title = "Majority are new to working in city and county governments, especially in Macon and Detroit.",
    subtitle = paste("Years of Experience: Range from 0 to 10+ years (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses",
    caption = "Source: TOPC Cohort 3 Survey (2023)"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.box = "vertical",
    legend.box.margin = margin(0, 0, 0, 0),
    legend.title = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  ) +
  guides(fill = guide_legend(nrow = 1, byrow = TRUE))
Q28. Please describe your racial/ethnic identity. Select all that apply.
Code
# Single-variable frequency plot
q28_frequency <- svytable(~q28, design = design)

ggplot(as.data.frame(q28_frequency), aes(x = q28, y = Freq)) +
  geom_bar(stat = "identity", fill = "#041e42", width = 0.7) +
  coord_flip() +
  expand_limits(y = c(0, total_sample)) +
  labs(
    # Descriptive title; the original reused the Q27 title about years of service
    title = "Racial/ethnic identity of survey respondents.",
    subtitle = paste("Total number of survey respondents (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.box = "vertical",
    legend.box.margin = margin(0, 0, 0, 0),
    legend.title = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  ) +
  guides(fill = guide_legend(nrow = 1, byrow = TRUE))
Q29. Which gender do you most closely identify with (or self-describe in ‘other’)?
Code
# Single-variable frequency plot
q29_frequency <- svytable(~q29, design = design)

ggplot(as.data.frame(q29_frequency), aes(x = q29, y = Freq)) +
  geom_bar(stat = "identity", fill = "#041e42", width = 0.7) +
  coord_flip() +
  expand_limits(y = c(0, total_sample)) +
  labs(
    title = "More than half of respondents identify as male.",
    subtitle = paste("Total number of survey respondents (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.box = "vertical",
    legend.box.margin = margin(0, 0, 0, 0),
    legend.title = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  ) +
  guides(fill = guide_legend(nrow = 1, byrow = TRUE))
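The single-variable frequency plots in this section share almost all of their code, so they could be produced by one small function. The sketch below is one possible shape for such a helper, not the code used for the figures: `plot_frequency()` and its arguments are hypothetical names, and the example frequencies are made up for illustration.

```r
library(ggplot2)

# Hypothetical helper: horizontal bar chart of a frequency table.
# freq_df must have a `Freq` column plus the column named in `var`.
plot_frequency <- function(freq_df, var, title, n_total) {
  # Copy the two needed columns under fixed names so aes() can refer to them
  plot_df <- data.frame(level = freq_df[[var]], Freq = freq_df$Freq)

  ggplot(plot_df, aes(x = level, y = Freq)) +
    geom_col(fill = "#041e42", width = 0.7) +
    coord_flip() +
    expand_limits(y = c(0, n_total)) +
    labs(
      title = title,
      subtitle = paste("Total number of survey respondents (n =", n_total, ")"),
      x = "",
      y = "Number of Responses"
    ) +
    theme_minimal() +
    theme(
      panel.grid.major.y = element_blank(),
      panel.grid.minor.y = element_blank(),
      plot.title = element_text(face = "bold")
    )
}

# Made-up frequencies for illustration only
example_freq <- data.frame(answer = c("Option A", "Option B"), Freq = c(5, 8))
example_plot <- plot_frequency(example_freq, "answer", "Example title", n_total = 13)
```

With the survey design in place, a call would look like `plot_frequency(as.data.frame(svytable(~q29, design = design)), "q29", "More than half of respondents identify as male.", total_sample)`. The legend settings are left out because these plots map no variable to fill, so no legend is drawn.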
Q5. I learned skills to promote equity and foster inclusive spaces in my work.
Code
# Single-variable frequency plot
q5_frequency <- svytable(~q5, design = design)

# Make sure all four rating levels appear, including those with no responses
all_levels_df <- data.frame(q5 = factor(1:4, levels = 1:4), Freq = integer(4))
q5_frequency_df <- merge(all_levels_df, as.data.frame(q5_frequency), by = "q5", all.x = TRUE)
q5_frequency_df$Freq <- rowSums(q5_frequency_df[, c("Freq.x", "Freq.y")], na.rm = TRUE)

q5_post_frequency_plot <- ggplot(q5_frequency_df, aes(x = q5, y = Freq)) +
  geom_bar(stat = "identity", fill = "#041e42", width = 0.7) +
  coord_flip() +
  expand_limits(y = c(0, total_sample)) +
  labs(
    title = "All respondents agree they learned new skills to promote\nequity and inclusion in their work.",
    subtitle = paste("Rating Scale: 4 - Strongly Agree to 1 - Strongly Disagree (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses",
    caption = "Source: TOPC Cohort 3 Survey (2023)"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  )

print(q5_post_frequency_plot)
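The merge step in the code above exists so that ratings nobody chose still appear as zero-height bars. For plain, unweighted counts the same effect comes from declaring all four levels on the factor before tabulating. This is a minimal base-R sketch with hypothetical responses; whether the declared levels carry through svytable() on the weighted design is an assumption to verify.

```r
# Hypothetical responses on the 1-4 scale; nobody chose 1 or 2
responses <- c(3, 3, 4, 3, 4)

# Declaring every level up front keeps unused ratings in the table
responses_f <- factor(responses, levels = 1:4)

# table() reports a zero count for each declared level with no responses
counts_df <- as.data.frame(table(q5 = responses_f))
print(counts_df)
```

The resulting data frame has the same `q5` and `Freq` columns that the plotting code expects.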
Code
ggsave("plots/q5_post_frequency_plot.png", q5_post_frequency_plot)

# Cross-tabulation with cities
q5_cross_tab_city <- svytable(~q5 + city_mapped, design = design)

# Order the rating levels from 1 to 4 so the fill colors map consistently
q5_cross_tab_city_df <- as.data.frame(q5_cross_tab_city)
q5_cross_tab_city_df$q5 <- factor(q5_cross_tab_city_df$q5, levels = c("1", "2", "3", "4"))

q5_post_cross_tab_city_plot <- ggplot(q5_cross_tab_city_df, aes(x = city_mapped, y = Freq, fill = q5)) +
  geom_bar(stat = "identity", position = "stack", width = 0.7) +
  coord_flip() +
  expand_limits(y = c(0, total_sample)) +
  scale_fill_manual(
    values = c("4" = "#041e42", "3" = "#6eaddc", "2" = "#da291c", "1" = "#1B786E"),
    drop = FALSE
  ) +
  labs(
    title = "Majority agree that they learned equity and inclusion skills, with\nsome strongly agreeing, particularly in Miami and Macon.",
    subtitle = paste("Rating Scale: 4 - Strongly Agree to 1 - Strongly Disagree (n =", total_sample, ")"),
    x = "",
    y = "Number of Responses",
    caption = "Source: TOPC Cohort 3 Survey (2023)"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.title = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    plot.title = element_text(face = "bold"),
    plot.caption = element_text(hjust = 0)
  ) +
  guides(fill = guide_legend(reverse = TRUE, nrow = 1, byrow = TRUE))

print(q5_post_cross_tab_city_plot)